Contributor

@mhl-b mhl-b commented Sep 10, 2024

The current default S3 repository settings conflict with the S3 limit for multipart uploads. We split large files into 5 TiB chunks and upload each chunk with a 100 MB part size, so a single chunk can need roughly 50k parts, whereas the S3 limit is 10k parts per upload. https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html

This PR changes the default chunk_size to default_part_size * max_parts_number (10k), which should fall in the range [50 GB, 5 TiB].
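
For reference, the arithmetic behind these numbers can be sketched as below. This is a minimal illustration and not code from the PR: the function names are hypothetical, sizes are assumed to be binary units (1 MB taken as 2^20 bytes), and the 5 TiB cap is simply the upper bound quoted above.

```python
# Sketch of the part-count arithmetic; names here are illustrative only.
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

S3_MAX_PARTS = 10_000      # S3 limit on parts per multipart upload
MAX_CHUNK_SIZE = 5 * TIB   # upper bound quoted in the PR description (assumption)


def parts_needed(chunk_size: int, part_size: int) -> int:
    """Number of parts required to upload one chunk (ceiling division)."""
    return (chunk_size + part_size - 1) // part_size


def proposed_default_chunk_size(part_size: int, max_parts: int = S3_MAX_PARTS) -> int:
    """part_size * max_parts, capped at the maximum chunk size."""
    return min(part_size * max_parts, MAX_CHUNK_SIZE)


if __name__ == "__main__":
    # Current defaults: one 5 TiB chunk at a 100 MiB part size.
    print(parts_needed(5 * TIB, 100 * MIB))
    # Proposed default chunk size for a 100 MiB part size, in GiB.
    print(proposed_default_chunk_size(100 * MIB) / GIB)
```

Under these assumptions a 5 TiB chunk at a 100 MiB part size needs 52,429 parts, well over the 10k limit, while the proposed default works out to about 977 GiB for the same part size.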

@mhl-b mhl-b added the >bug, :Distributed Coordination/Snapshot/Restore, Team:Distributed (Obsolete), and v8.16.0 labels Sep 10, 2024
@mhl-b mhl-b requested a review from DaveCTurner September 10, 2024 15:45
@mhl-b mhl-b marked this pull request as ready for review September 10, 2024 15:45
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@mhl-b mhl-b changed the title from "Change default s3 chunk_size" to "Change default S3 chunk_size" Sep 10, 2024
@mhl-b mhl-b changed the title from "Change default S3 chunk_size" to "Change default S3 repository chunk_size" Sep 10, 2024
Contributor

@DaveCTurner DaveCTurner left a comment

I'd rather we approached this differently: let's introduce a new repository setting for the maximum number of parts, defaulting to 10k, and then work out how to split each file up into blobs and parts dynamically based on the combination of max-parts, chunk_size and buffer_size. That way if users have set one or more of these values then we'll continue to respect their setting.
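
One possible reading of this suggestion, as a rough sketch only: cap each blob at `min(chunk_size, buffer_size * max_parts)` and then cut each blob into `buffer_size` parts, so that whatever the user configured is still honoured. The function name `plan_blobs`, the `max_parts` parameter name, and the capping rule itself are assumptions made for illustration; they are not taken from the follow-up implementation.

```python
# Hypothetical sketch: split a file into blobs and parts so that no blob
# exceeds the configured maximum number of parts.
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def plan_blobs(file_size: int, chunk_size: int, buffer_size: int,
               max_parts: int = 10_000) -> list[tuple[int, int]]:
    """Return a list of (blob_size_in_bytes, part_count) pairs."""
    if min(file_size, chunk_size, buffer_size, max_parts) <= 0:
        raise ValueError("all sizes and max_parts must be positive")
    # A blob may not be larger than chunk_size, nor need more than max_parts parts.
    effective_blob_size = min(chunk_size, buffer_size * max_parts)
    plan = []
    remaining = file_size
    while remaining > 0:
        blob_size = min(remaining, effective_blob_size)
        part_count = (blob_size + buffer_size - 1) // buffer_size  # ceiling division
        plan.append((blob_size, part_count))
        remaining -= blob_size
    return plan


if __name__ == "__main__":
    # A 5 TiB file with chunk_size = 5 TiB and buffer_size = 100 MiB.
    for blob_size, part_count in plan_blobs(5 * TIB, 5 * TIB, 100 * MIB):
        print(blob_size // MIB, "MiB in", part_count, "parts")
```

With these inputs the sketch produces six blobs, none needing more than 10k parts, whereas a user who set a smaller chunk_size (say 1 GiB) would see that value used unchanged.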

@mhl-b
Contributor Author

mhl-b commented Oct 3, 2024

Closing in favour of #113989, using a new repository setting for the maximum number of parts

@mhl-b mhl-b closed this Oct 3, 2024

Labels

>bug, :Distributed Coordination/Snapshot/Restore, Team:Distributed (Obsolete), v9.0.0

4 participants